refactor(e2e): migrate to external oadp-must-gather container image#2007
refactor(e2e): migrate to external oadp-must-gather container image#2007kaovilai wants to merge 3 commits intoopenshift:oadp-devfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughRemoves the local must-gather implementation (sources, build, module, and docs), deletes deprecated must-gather wrapper scripts, and updates Makefile and E2E tests to optionally build/push a must-gather container image from an external repo or run must-gather via a configurable image (MUST_GATHER_IMAGE). Tests discover the summary via a glob under the must-gather artifacts. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test Runner
participant OC as oc CLI
participant Reg as Image Registry
participant K8s as Cluster API
participant FS as Filesystem
Test->>OC: run "oc adm must-gather --image=${MUST_GATHER_IMAGE} --dest-dir=/tmp/mg"
OC->>Reg: pull ${MUST_GATHER_IMAGE}
OC->>K8s: collect cluster resources/plugins
K8s-->>OC: resource responses
OC->>FS: write artifacts to /tmp/mg/clusters/<clusterID>/
Test->>FS: glob for oadp-must-gather-summary.md and validate contents
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 10 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Review rate limit: 9/10 reviews remaining, refill in 6 minutes. Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kaovilai The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/e2e/lib/apps.go (1)
399-410: Add bounds checking before slicing ClusterID.Line 409 will panic if
clusterVersion.Spec.ClusterIDis shorter than 8 characters. While cluster IDs are typically UUIDs (36 characters), defensive programming requires bounds checking.Apply this diff to add bounds checking:
clusterVersion := &clusterVersionList.Items[0] - clusterID := string(clusterVersion.Spec.ClusterID[:8]) + clusterIDStr := string(clusterVersion.Spec.ClusterID) + if len(clusterIDStr) < 8 { + return fmt.Errorf("cluster ID is too short: %s", clusterIDStr) + } + clusterID := clusterIDStr[:8]
🧹 Nitpick comments (1)
tests/e2e/lib/apps.go (1)
422-432: Consider logging which summary file is being read.When multiple matches exist (e.g., from previous test runs), line 423 silently uses the first match. Adding a log statement would help with debugging.
// Read and validate the summary + log.Printf("Reading must-gather summary from: %s", matches[0]) mustGatherSummaryContent, err := os.ReadFile(matches[0])
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
must-gather/go.sumis excluded by!**/*.sum
📒 Files selected for processing (24)
Makefile(1 hunks)docs/developer/testing/TESTING.md(1 hunks)must-gather/Dockerfile(0 hunks)must-gather/README.md(0 hunks)must-gather/cmd/main.go(0 hunks)must-gather/deprecated/gather_1h(0 hunks)must-gather/deprecated/gather_1h_essential(0 hunks)must-gather/deprecated/gather_24h(0 hunks)must-gather/deprecated/gather_24h_essential(0 hunks)must-gather/deprecated/gather_6h(0 hunks)must-gather/deprecated/gather_6h_essential(0 hunks)must-gather/deprecated/gather_72h(0 hunks)must-gather/deprecated/gather_72h_essential(0 hunks)must-gather/deprecated/gather_all(0 hunks)must-gather/deprecated/gather_all_essential(0 hunks)must-gather/deprecated/gather_metrics_dump(0 hunks)must-gather/deprecated/gather_with_timeout(0 hunks)must-gather/deprecated/gather_without_tls(0 hunks)must-gather/go.mod(0 hunks)must-gather/pkg/cli.go(0 hunks)must-gather/pkg/gather/gather.go(0 hunks)must-gather/pkg/gvk/gvk.go(0 hunks)must-gather/pkg/templates/summary.go(0 hunks)tests/e2e/lib/apps.go(2 hunks)
💤 Files with no reviewable changes (21)
- must-gather/deprecated/gather_metrics_dump
- must-gather/deprecated/gather_1h
- must-gather/deprecated/gather_72h
- must-gather/go.mod
- must-gather/deprecated/gather_72h_essential
- must-gather/deprecated/gather_24h_essential
- must-gather/deprecated/gather_all_essential
- must-gather/deprecated/gather_with_timeout
- must-gather/pkg/gather/gather.go
- must-gather/cmd/main.go
- must-gather/README.md
- must-gather/deprecated/gather_without_tls
- must-gather/deprecated/gather_6h
- must-gather/deprecated/gather_1h_essential
- must-gather/pkg/cli.go
- must-gather/deprecated/gather_all
- must-gather/deprecated/gather_6h_essential
- must-gather/Dockerfile
- must-gather/pkg/gvk/gvk.go
- must-gather/pkg/templates/summary.go
- must-gather/deprecated/gather_24h
🔇 Additional comments (4)
Makefile (1)
792-792: LGTM! Dependency simplified correctly.The removal of the
build-must-gatherdependency aligns with the PR objective to use an external container image instead of building locally.docs/developer/testing/TESTING.md (1)
28-29: LGTM! Documentation clearly describes the new environment variables.The documentation accurately reflects the new must-gather behavior with environment-driven image selection.
tests/e2e/lib/apps.go (2)
381-387: LGTM! Environment variable handling is correct.The implementation properly reads the
MUST_GATHER_IMAGEenvironment variable with a sensible default value, and includes helpful comments about version-specific usage.
389-397: LGTM! Command execution and error handling look good.The
oc adm must-gathercommand is constructed correctly with appropriate flags, and errors are captured with output for debugging.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/e2e/lib/apps.go (1)
411-425: Consider using a wildcard pattern for better robustness.The current pattern derivation logic (lines 415-418) attempts to mirror how
oc adm must-gathernames directories, but this could fail if the actual naming logic differs or if edge cases exist (e.g., registries with multiple dots, special characters in tags). The past review suggested using a simpler wildcard pattern, which would be more resilient.Apply this diff to use a wildcard pattern and add debugging information:
// Find the must-gather output directory - // oc adm must-gather creates a directory based on the image name with registry separators - // replaced by hyphens. E.g., quay.io/konveyor/oadp-must-gather:latest -> quay-io-konveyor-oadp-must-gather-* - // We need to derive the pattern from the actual image being used - imagePattern := strings.ReplaceAll(mustGatherImage, ":", "-") - imagePattern = strings.ReplaceAll(imagePattern, "/", "-") - imagePattern = strings.ReplaceAll(imagePattern, ".", "-") - pattern := filepath.Join(artifact_dir, imagePattern+"-*", "clusters", clusterID, "oadp-must-gather-summary.md") + // oc adm must-gather creates: <artifact_dir>/<image-based-dir>/clusters/<cluster-id>/ + // Use a wildcard pattern to match any must-gather output directory + pattern := filepath.Join(artifact_dir, "*", "clusters", clusterID, "oadp-must-gather-summary.md") matches, err := filepath.Glob(pattern) if err != nil { return fmt.Errorf("error finding must-gather summary: %w", err) } if len(matches) == 0 { - return fmt.Errorf("no must-gather summary found at pattern: %s", pattern) + // List what directories exist to help debug + dirs, _ := filepath.Glob(filepath.Join(artifact_dir, "*")) + return fmt.Errorf("no must-gather summary found at pattern: %s\nDirectories in artifact_dir: %v", pattern, dirs) }This approach is simpler, more robust, and provides better debugging information when the pattern fails to match.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (1)
tests/e2e/lib/apps.go(2 hunks)
🔇 Additional comments (3)
tests/e2e/lib/apps.go (3)
381-387: LGTM! Clean environment variable handling.The environment variable handling is well-implemented with a sensible default and helpful inline documentation for users who need version-specific images.
389-397: LGTM! Proper command execution and error handling.The command execution correctly uses
CombinedOutputand wraps errors with the command output for better debugging.
427-441: LGTM! Validation logic is sound.The summary file reading and validation correctly checks for the expected success message. The TODO comment appropriately flags future enhancement opportunities.
|
/retest ai-retester: The OADP E2E test failed because the "Mongo application CSI" test timed out while waiting for the The The |
Good, https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_oadp-operator/2007/pull-ci-openshift-oadp-operator-oadp-dev-4.19-e2e-test-aws/1985812568559587328/artifacts/e2e-test-aws/e2e/artifacts/Mongo%20application%20CSI/ still populated, which means this PR works. |
ac8bcc6 to
2ae6d45
Compare
|
New changes are detected. LGTM label has been removed. |
|
updated after rebase |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
tests/e2e/lib/apps.go (1)
409-409: Potential panic: validate ClusterID length before slicing.This is the same critical issue flagged in the previous review. The slice operation
[:8]will panic at runtime ifClusterIDis shorter than 8 characters.Apply this diff to safely handle shorter ClusterIDs:
- clusterID := string(clusterVersion.Spec.ClusterID[:8]) + clusterID := string(clusterVersion.Spec.ClusterID) + if len(clusterID) > 8 { + clusterID = clusterID[:8] + }
🧹 Nitpick comments (1)
tests/e2e/lib/apps.go (1)
423-425: Enhance error message with available directories for debugging.When the pattern fails to match, listing the actual directories in
artifact_dirwould help diagnose whether the issue is pattern mismatch or missing must-gather output.Apply this diff:
if len(matches) == 0 { - return fmt.Errorf("no must-gather summary found at pattern: %s", pattern) + // List what directories exist to help debug + dirs, _ := filepath.Glob(filepath.Join(artifact_dir, "*")) + return fmt.Errorf("no must-gather summary found at pattern: %s\nDirectories in artifact_dir: %v", pattern, dirs) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
must-gather/go.sumis excluded by!**/*.sum
📒 Files selected for processing (24)
Makefile(1 hunks)docs/developer/testing/TESTING.md(1 hunks)must-gather/Dockerfile(0 hunks)must-gather/README.md(0 hunks)must-gather/cmd/main.go(0 hunks)must-gather/deprecated/gather_1h(0 hunks)must-gather/deprecated/gather_1h_essential(0 hunks)must-gather/deprecated/gather_24h(0 hunks)must-gather/deprecated/gather_24h_essential(0 hunks)must-gather/deprecated/gather_6h(0 hunks)must-gather/deprecated/gather_6h_essential(0 hunks)must-gather/deprecated/gather_72h(0 hunks)must-gather/deprecated/gather_72h_essential(0 hunks)must-gather/deprecated/gather_all(0 hunks)must-gather/deprecated/gather_all_essential(0 hunks)must-gather/deprecated/gather_metrics_dump(0 hunks)must-gather/deprecated/gather_with_timeout(0 hunks)must-gather/deprecated/gather_without_tls(0 hunks)must-gather/go.mod(0 hunks)must-gather/pkg/cli.go(0 hunks)must-gather/pkg/gather/gather.go(0 hunks)must-gather/pkg/gvk/gvk.go(0 hunks)must-gather/pkg/templates/summary.go(0 hunks)tests/e2e/lib/apps.go(2 hunks)
💤 Files with no reviewable changes (21)
- must-gather/deprecated/gather_all_essential
- must-gather/deprecated/gather_metrics_dump
- must-gather/Dockerfile
- must-gather/deprecated/gather_72h
- must-gather/deprecated/gather_24h_essential
- must-gather/deprecated/gather_1h
- must-gather/deprecated/gather_6h
- must-gather/deprecated/gather_1h_essential
- must-gather/go.mod
- must-gather/deprecated/gather_all
- must-gather/pkg/cli.go
- must-gather/deprecated/gather_24h
- must-gather/deprecated/gather_without_tls
- must-gather/deprecated/gather_with_timeout
- must-gather/cmd/main.go
- must-gather/pkg/gather/gather.go
- must-gather/README.md
- must-gather/pkg/gvk/gvk.go
- must-gather/deprecated/gather_72h_essential
- must-gather/deprecated/gather_6h_essential
- must-gather/pkg/templates/summary.go
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/developer/testing/TESTING.md
- Makefile
🔇 Additional comments (1)
tests/e2e/lib/apps.go (1)
381-397: LGTM: Environment-driven must-gather execution.The environment variable setup and command execution are well-implemented:
- Sensible default image with clear upgrade path via
MUST_GATHER_IMAGE- Proper use of
oc adm must-gatherwith explicit flags- Comprehensive error handling with command output included
|
/retest ai-retester: The end-to-end (e2e) test Why the Mongo test blew up The entire CI run was green until we hit the Mongo application CSI test. The test had just finished installing the Mongo database and the In short, the todolist container could not start (most likely the image Short failure summary
So, the single failure occurs when the Mongo T‑odo‑list pod failed to start (likely a race or resource issue), and the must‑gather step also failed to locate its summary file. |
|
/retest |
1 similar comment
|
/retest |
|
ugh.. AI reviews :) If this not running the must-gather atm in ci? I'll look |
|
Feedback is to build from source of a must-gather repo clone/local-dir so that must-gather can be developed on and tested in e2e w/o having to push an image to registry. |
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Remove local must-gather directory and build process in favor of using the external quay.io/konveyor/oadp-must-gather:latest image via oc adm must-gather. This eliminates architecture mismatch issues and keeps must-gather code in its dedicated repository. Changes: - Updated RunMustGather() in tests/e2e/lib/apps.go to use oc adm must-gather - Added MUST_GATHER_IMAGE env var (defaults to quay.io/konveyor/oadp-must-gather:latest) - Removed build-must-gather target from Makefile - Removed entire must-gather/ directory (3,174 lines deleted) - Updated documentation in TESTING.md The SKIP_MUST_GATHER flag is preserved for skipping must-gather collection. Version-specific images can be used by setting MUST_GATHER_IMAGE env var. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
The directory pattern was hardcoded to 'quay-io-konveyor-oadp-must-gather-*' which breaks when using custom images via MUST_GATHER_IMAGE env var. Now dynamically derives the pattern from the actual image name by replacing registry separators (. / :) with hyphens to match oc adm must-gather's directory naming convention. Examples: - quay.io/konveyor/oadp-must-gather:latest -> quay-io-konveyor-oadp-must-gather-latest-* - docker.io/myuser/custom:v1 -> docker-io-myuser-custom-v1-* 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
2ae6d45 to
4b00056
Compare
|
/test all |
Add build-must-gather Makefile target and MUST_GATHER_REPO/MUST_GATHER_BRANCH env vars to build must-gather from GitHub source, push to ttl.sh, and use in e2e tests. Allows testing must-gather PRs without manually publishing images. Also improves RunMustGather: safe clusterID slice, wildcard glob pattern, sorted matches for newest, and debug dir listing on failure. Generated with [Claude Code](https://claude.ai/code) via [Happy](https://happy.engineering) Co-Authored-By: Claude <noreply@anthropic.com> Co-Authored-By: Happy <yesreply@happy.engineering> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Makefile`:
- Around line 983-994: The help comment for the build-must-gather target is
inconsistent with the actual default; update the comment or the variable so they
match: either change the comment text "Uses MUST_GATHER_BRANCH (default: main)"
to "Uses MUST_GATHER_BRANCH (default: oadp-dev)" to reflect MUST_GATHER_BRANCH
?= oadp-dev, or change the default assignment (MUST_GATHER_BRANCH ?= oadp-dev)
to MUST_GATHER_BRANCH ?= main so the comment stays accurate; edit the Makefile
comment adjacent to the build-must-gather target and ensure MUST_GATHER_BRANCH
is the intended default.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: eef1a2d9-7cd1-45be-9d8f-b2494a5bf09a
📒 Files selected for processing (3)
Makefiledocs/developer/testing/TESTING.mdtests/e2e/lib/apps.go
✅ Files skipped from review due to trivial changes (1)
- docs/developer/testing/TESTING.md
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/e2e/lib/apps.go
| .PHONY: build-must-gather | ||
| build-must-gather: ## Build must-gather image from GitHub source. Requires MUST_GATHER_REPO (e.g., openshift/oadp-must-gather). Uses MUST_GATHER_BRANCH (default: main). | ||
| ifeq ($(MUST_GATHER_REPO),) | ||
| $(error MUST_GATHER_REPO is required (e.g., openshift/oadp-must-gather)) | ||
| endif | ||
| $(eval MUST_GATHER_TMP := $(shell mktemp -d)) | ||
| git clone --depth=1 --branch $(MUST_GATHER_BRANCH) https://github.com/$(MUST_GATHER_REPO).git $(MUST_GATHER_TMP) | ||
| $(CONTAINER_TOOL) build --load -t $(MUST_GATHER_IMAGE) -f $(MUST_GATHER_TMP)/Dockerfile $(MUST_GATHER_TMP) | ||
| $(CONTAINER_TOOL) push $(MUST_GATHER_IMAGE) | ||
| rm -rf $(MUST_GATHER_TMP) | ||
| @echo "Must-gather image built and pushed: $(MUST_GATHER_IMAGE)" | ||
|
|
There was a problem hiding this comment.
Documentation inconsistency: comment says "default: main" but actual default is "oadp-dev".
The help comment on line 984 states Uses MUST_GATHER_BRANCH (default: main) but line 935 defines MUST_GATHER_BRANCH ?= oadp-dev.
📝 Proposed fix
.PHONY: build-must-gather
-build-must-gather: ## Build must-gather image from GitHub source. Requires MUST_GATHER_REPO (e.g., openshift/oadp-must-gather). Uses MUST_GATHER_BRANCH (default: main).
+build-must-gather: ## Build must-gather image from GitHub source. Requires MUST_GATHER_REPO (e.g., openshift/oadp-must-gather). Uses MUST_GATHER_BRANCH (default: oadp-dev).
ifeq ($(MUST_GATHER_REPO),)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Makefile` around lines 983 - 994, The help comment for the build-must-gather
target is inconsistent with the actual default; update the comment or the
variable so they match: either change the comment text "Uses MUST_GATHER_BRANCH
(default: main)" to "Uses MUST_GATHER_BRANCH (default: oadp-dev)" to reflect
MUST_GATHER_BRANCH ?= oadp-dev, or change the default assignment
(MUST_GATHER_BRANCH ?= oadp-dev) to MUST_GATHER_BRANCH ?= main so the comment
stays accurate; edit the Makefile comment adjacent to the build-must-gather
target and ensure MUST_GATHER_BRANCH is the intended default.
|
/test tide |
|
/test all |
|
@kaovilai: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
Remove local must-gather directory and build process in favor of using the external
quay.io/konveyor/oadp-must-gather:latestimage viaoc adm must-gather. This eliminates architecture mismatch issues and keeps must-gather code in its dedicated repository at https://github.com/openshift/oadp-must-gather.Changes
RunMustGather()intests/e2e/lib/apps.goto useoc adm must-gathercommandMUST_GATHER_IMAGEenvironment variable (defaults toquay.io/konveyor/oadp-must-gather:latest)build-must-gathertarget from Makefilemust-gather/directory (3,174 lines deleted)docs/developer/testing/TESTING.mdBenefits
MUST_GATHER_IMAGEenv varSKIP_MUST_GATHERflag preserved for skipping collectionUsage
Default behavior (uses latest):
With custom image:
Skip must-gather:
Fixes #2005
Testing
Note
Responses generated with Claude
Summary by CodeRabbit
Refactor
New Features
Documentation
Chores